Bagging Multiple Comparisons from Microarray Data
Abstract
The problem of large-scale simultaneous hypothesis testing is revisited. Bagging and subagging procedures are proposed to improve the discovery power of the tests. The procedures are applied to both simulated and real data. Bagging and subagging are shown to significantly improve power at the cost of a small increase in the false discovery rate, with the proposed ‘maximum contrast’ subagging having an edge over bagging: it yields similar power but significantly smaller false discovery rates.
Similar resources
A Comparison of Ensemble Methods for Microarray Data Analysis
Machine Learning tools are increasingly being applied to analyze data from microarray experiments. These include ensemble methods where weighted votes of constructed base classifiers are used to classify data. We compare the performance of AdaBoost, bagging and BagBoost on gene expression data from the yeast cell cycle. AdaBoost was found to be more effective for the data than bagging. BagBoost...
A Maximally Diversified Multiple Decision Tree Algorithm for Microarray Data Classification
We investigate the idea of using diversified multiple trees for Microarray data classification. We propose an algorithm of Maximally Diversified Multiple Trees (MDMT), which makes use of a set of unique trees in the decision committee. We compare MDMT with some well-known ensemble methods, namely AdaBoost, Bagging, and Random Forests. We also compare MDMT with a diversified decision tree algori...
Bagging to Improve the Accuracy of A Clustering Procedure
MOTIVATION The microarray technology is increasingly being applied in biological and medical research to address a wide range of problems such as the classification of tumors. An important statistical question associated with tumor classification is the identification of new tumor classes using gene expression profiles. Essential aspects of this clustering problem include identifying accurate p...
A Robust Ensemble Classification Method for Microarray Data Analysis
Apart from the dimensionality problem, the uncertainty of Microarray data quality is another major challenge of Microarray classification. Microarray data contain various levels of noise, quite often high levels, and such data lead to unreliable, low-accuracy analysis in addition to the high-dimensionality problem. In this paper, we propose a new Microarray data classification ...
BagBoosting for tumor classification with gene expression data
MOTIVATION Microarray experiments are expected to contribute significantly to the progress in cancer treatment by enabling a precise and early diagnosis. They create a need for class prediction tools, which can deal with a large number of highly correlated input variables, perform feature selection and provide class probability estimates that serve as a quantification of the predictive uncertai...